Allow batching the output of a join #2310

Merged
revans2 merged 10 commits into NVIDIA:branch-0.6 from the initial_out_of_core_join branch on May 3, 2021

Conversation

revans2
Collaborator

@revans2 revans2 commented Apr 29, 2021

This is the first step toward an out-of-core join. It at least partially addresses #20.

This depends on rapidsai/cudf#8118 to go in first.

For most cases that I have tested, this is strictly better than what was there before. If the output of the join fits in the output batch size, then the join will happen just like it does today. If the output is larger than that, we can now output it in multiple batches. The problem that I have found is that the gather map is not spillable, and after a single batch is output the GPU Semaphore is released. This means that for contrived joins that explode evenly, each active task will have a potentially large gather map in memory. I think I can make it spillable without a lot of work. If I can, then I might just do it. But I also want to spend some time running benchmarks to see if this can help fix some of the exploding join issues we have seen there.
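As a rough illustration of the batching idea described above (not the code in this PR), the sketch below builds gather maps for an inner join once and then emits the output in slices of at most `max_rows` rows. It is plain Python rather than cudf, and every name in it (`inner_join_gather_maps`, `batched_join`, `max_rows`) is hypothetical. The real implementation targets a batch size, while this sketch uses a simple row count.

```python
from typing import Iterator, List, Sequence, Tuple


def inner_join_gather_maps(left_keys: Sequence, right_keys: Sequence) -> Tuple[List[int], List[int]]:
    """Return parallel index lists (left_map, right_map) for every matching key pair."""
    positions = {}
    for j, key in enumerate(right_keys):
        positions.setdefault(key, []).append(j)
    left_map, right_map = [], []
    for i, key in enumerate(left_keys):
        for j in positions.get(key, []):
            left_map.append(i)
            right_map.append(j)
    return left_map, right_map


def batched_join(left: Sequence[tuple], right: Sequence[tuple],
                 key: int, max_rows: int) -> Iterator[List[tuple]]:
    """Yield the join output in batches of at most max_rows rows (hypothetical helper)."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # The gather maps cover the whole join and stay alive until the last batch is
    # emitted. That lifetime is why an exploding join can pin a large map in memory.
    left_map, right_map = inner_join_gather_maps(
        [row[key] for row in left], [row[key] for row in right])
    for start in range(0, len(left_map), max_rows):
        end = start + max_rows
        # Gather only the rows named by this slice of the maps.
        yield [left[i] + right[j]
               for i, j in zip(left_map[start:end], right_map[start:end])]
```

When the whole result fits in `max_rows`, the generator yields a single batch, which mirrors the "just like it does today" case.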

Signed-off-by: Robert (Bobby) Evans <[email protected]>
@revans2 revans2 added labels Apr 29, 2021: feature request (New feature or request), cudf_dependency (An issue or PR with this label depends on a new feature in cudf)
@revans2 revans2 added this to the Apr 26 - May 7 milestone Apr 29, 2021
@revans2 revans2 self-assigned this Apr 29, 2021
Collaborator Author

@revans2 revans2 left a comment

I'll try to add in some more java docs too.

@revans2
Collaborator Author

revans2 commented Apr 30, 2021

I have been adding in spilling for the gather maps, which let me push things a bit further, and I found a bug in the gather implementation.
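To make the spilling idea concrete, here is a minimal sketch of a gather map that can be moved out of memory between batches and restored on the next access. This is not the plugin's spill framework: the class name `SpillableGatherMap`, its methods, and the use of a temporary file as the spill target are all assumptions made for illustration.

```python
import os
import pickle
import tempfile
from typing import Iterable, List, Optional


class SpillableGatherMap:
    """Hypothetical wrapper that holds join indices in memory until spill() is called."""

    def __init__(self, indices: Iterable[int]):
        self._indices: Optional[List[int]] = list(indices)
        self._path: Optional[str] = None
        self._closed = False

    @property
    def is_spilled(self) -> bool:
        return not self._closed and self._indices is None

    def spill(self) -> None:
        """Write the indices to a temp file and drop the in-memory copy."""
        if self._closed or self._indices is None:
            return
        fd, self._path = tempfile.mkstemp(suffix=".gathermap")
        with os.fdopen(fd, "wb") as f:
            pickle.dump(self._indices, f)
        self._indices = None

    def get(self) -> List[int]:
        """Return the indices, reading them back from disk if they were spilled."""
        if self._closed:
            raise ValueError("gather map is closed")
        if self._indices is None:
            with open(self._path, "rb") as f:
                self._indices = pickle.load(f)
            os.remove(self._path)
            self._path = None
        return self._indices

    def close(self) -> None:
        """Release the in-memory copy and delete any spill file."""
        if self._path is not None:
            os.remove(self._path)
            self._path = None
        self._indices = None
        self._closed = True
```

In this sketch a task would call `spill()` when it gives up the GPU after emitting a batch and `get()` when it resumes, so an idle task no longer holds its full map in memory.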

rapidsai/cudf#8121

Collaborator

@abellina abellina left a comment

First pass @revans2

@revans2
Collaborator Author

revans2 commented May 1, 2021

I think I have addressed all of the review comments. I put together my own quick hack for rapidsai/cudf#8121, and it is not enough to be able to run q72. Even at a batch size of 26m and 200 partitions, it took over 6 minutes to finish one of the 200 join tasks and failed on the next one. We are going to have to really think about what we want to try and do to support query 72. But all of the others run with reasonable configurations.

@revans2 revans2 marked this pull request as ready for review May 1, 2021 22:45
revans2 added 2 commits May 3, 2021 10:09
Tests are not passing with struct joins; need to do some more debugging
@revans2
Collaborator Author

revans2 commented May 3, 2021

build

@revans2
Collaborator Author

revans2 commented May 3, 2021

I upmerged and had to update the code for the new struct join support. A good thing too because it exposed a bug in my filtering code. It would only have been a performance regression before the struct code, but afterwards it became an error. This should be all ready to go now. The dependency is merged.

@revans2 revans2 merged commit 8acac67 into NVIDIA:branch-0.6 May 3, 2021
@revans2 revans2 deleted the initial_out_of_core_join branch May 3, 2021 18:25
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021